Sparse clustering, which aims to find a proper partition of an extremely high-dimensional data set containing redundant noise features, has attracted increasing interest in recent years. Existing studies commonly solve the problem in the framework of maximizing weighted feature contributions subject to an $\ell_2/\ell_1$ penalty. Nevertheless, this framework has two serious drawbacks: the solution unavoidably involves a considerable portion of the redundant noise features in many situations, and the framework neither offers an intuitive explanation of why it can select relevant features nor provides any theoretical guarantee of feature selection consistency. In this article, we attempt to overcome these drawbacks by developing a new sparse clustering framework that uses an $\ell_{\infty}/\ell_0$ penalty. First, we introduce new concepts of optimal partitions and noise features for high-dimensional data clustering problems, based on which the previously known framework can be intuitively explained in principle. Then, we apply the suggested $\ell_{\infty}/\ell_0$ framework to formulate a new sparse k-means model with the $\ell_{\infty}/\ell_0$ penalty ($\ell_0$-k-means for short), and we propose an efficient iterative algorithm for solving it. To better understand the behavior of $\ell_0$-k-means, we prove that the solution yielded by the $\ell_0$-k-means algorithm is feature-selection consistent whenever the data matrix is generated from a high-dimensional Gaussian mixture model. Finally, we provide experiments on both synthetic data and the Allen Developing Mouse Brain Atlas data to show that the proposed $\ell_0$-k-means exhibits better noise-feature detection capacity than the previously known sparse k-means with the $\ell_2/\ell_1$ penalty ($\ell_1$-k-means for short).
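The abstract does not spell out the iterative algorithm, but the general flavor of an $\ell_0$-constrained sparse k-means can be sketched as an alternating scheme: run k-means on the currently selected features, then reselect the $s$ features with the largest between-cluster sum of squares (a hard, $\ell_0$-style selection). The sketch below is a minimal illustration under these assumptions, not the paper's actual $\ell_0$-k-means; the function name, the farthest-first seeding, and the budget parameter `s` are all hypothetical choices for the example.

```python
import numpy as np

def l0_sparse_kmeans(X, k, s, n_iter=10):
    """Hypothetical sketch of an l0-constrained sparse k-means:
    alternate (i) k-means restricted to the s selected features with
    (ii) reselecting the s features whose between-cluster sum of
    squares (BCSS) under the current partition is largest."""
    n, p = X.shape
    active = np.arange(min(s, p))           # start from the first s features
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        Xa = X[:, active]
        # deterministic farthest-first seeding, then Lloyd iterations
        centers = [Xa[0]]
        for _ in range(k - 1):
            d = np.min([((Xa - c) ** 2).sum(axis=1) for c in centers], axis=0)
            centers.append(Xa[np.argmax(d)])
        centers = np.array(centers)
        for _ in range(10):
            d = ((Xa[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = Xa[labels == j].mean(axis=0)
        # per-feature BCSS under the current partition
        overall = X.mean(axis=0)
        bcss = np.zeros(p)
        for j in range(k):
            mask = labels == j
            if mask.any():
                bcss += mask.sum() * (X[mask].mean(axis=0) - overall) ** 2
        active = np.sort(np.argsort(bcss)[-s:])  # keep the s most informative
    return labels, active
```

In contrast to the $\ell_2/\ell_1$ approach, which assigns soft nonnegative weights to all features, this hard selection sets all but $s$ feature weights exactly to zero, which is why noise features can be excluded outright rather than merely down-weighted.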